In [1]:
%load_ext autoreload
%autoreload 2
import scdiff_init
import os
import subprocess
import CSHMM_analysis
%matplotlib inline

Model initialization with scdiff: http://www.cs.cmu.edu/~jund/scdiff/

In [2]:
data_file="treutlein2014"
In [3]:
scdiff_init.run_scdiff_init(data_file)
start clustering...
calculating affinity matrix ...
cell:0
cell:1
cell:2
cell:3
cell:4
cell:5
cell:6
cell:7
cell:8
cell:9
cell:10
cell:11
cell:12
cell:13
cell:14
cell:15
cell:16
cell:17
cell:18
cell:19
cell:20
cell:21
cell:22
cell:23
cell:24
cell:25
cell:26
cell:0
cell:1
cell:2
cell:3
cell:4
cell:5
cell:6
cell:7
cell:8
cell:9
cell:10
cell:11
cell:12
cell:13
cell:14
cell:15
cell:16
cell:17
cell:18
cell:19
cell:20
cell:21
cell:22
cell:23
cell:24
cell:25
cell:26
cell:27
cell:28
cell:29
cell:30
cell:31
cell:32
cell:33
cell:34
cell:35
cell:36
cell:37
cell:38
cell:39
cell:40
cell:41
cell:42
cell:43
cell:44
cell:45
cell:46
cell:47
cell:48
cell:49
cell:50
cell:51
cell:52
cell:53
cell:54
cell:55
cell:56
cell:57
cell:58
cell:59
cell:60
cell:61
cell:62
cell:63
cell:64
cell:65
cell:66
cell:67
cell:68
cell:69
cell:70
cell:71
cell:72
cell:73
cell:74
cell:75
cell:76
cell:77
cell:78
cell:79
learning K...
/usr0/home/chiehl1/venv/local/lib/python2.7/site-packages/sklearn/cluster/spectral.py:433: UserWarning: The spectral clustering API has changed. ``fit``now constructs an affinity matrix from data. To use a custom affinity matrix, set ``affinity=precomputed``.
  warnings.warn("The spectral clustering API has changed. ``fit``"
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
10.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
20.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
30.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
40.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
50.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
60.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
70.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
80.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
90.0%
2
3
4
5
6
7
8
9
2
3
4
5
6
7
8
9
100.0%
learning clustering seeds...
seeds:0
seeds:1
seeds:2
seeds:3
seeds:4
seeds:5
seeds:6
seeds:7
seeds:8
seeds:9
seeds:10
seeds:11
seeds:12
seeds:13
seeds:14
seeds:15
seeds:16
seeds:17
seeds:18
seeds:19
seeds:20
seeds:21
seeds:22
seeds:23
seeds:24
seeds:25
seeds:26
seeds:27
seeds:28
seeds:29
seeds:30
seeds:31
seeds:32
seeds:33
seeds:34
seeds:35
seeds:36
seeds:37
seeds:38
seeds:39
seeds:40
seeds:41
seeds:42
seeds:43
seeds:44
seeds:45
seeds:46
seeds:47
seeds:48
seeds:49
seeds:50
seeds:51
seeds:52
seeds:53
seeds:54
seeds:55
seeds:56
seeds:57
seeds:58
seeds:59
seeds:60
seeds:61
seeds:62
seeds:63
seeds:64
seeds:65
seeds:66
seeds:67
seeds:68
seeds:69
seeds:70
seeds:71
seeds:72
seeds:73
seeds:74
seeds:75
seeds:76
seeds:77
seeds:78
seeds:79
seeds:80
seeds:81
seeds:82
seeds:83
seeds:84
seeds:85
seeds:86
seeds:87
seeds:88
seeds:89
seeds:90
seeds:91
seeds:92
seeds:93
seeds:94
seeds:95
seeds:96
seeds:97
seeds:98
seeds:99
seeds:0
seeds:1
seeds:2
seeds:3
seeds:4
seeds:5
seeds:6
seeds:7
seeds:8
seeds:9
seeds:10
seeds:11
seeds:12
seeds:13
seeds:14
seeds:15
seeds:16
seeds:17
seeds:18
seeds:19
seeds:20
seeds:21
seeds:22
seeds:23
seeds:24
seeds:25
seeds:26
seeds:27
seeds:28
seeds:29
seeds:30
seeds:31
seeds:32
seeds:33
seeds:34
seeds:35
seeds:36
seeds:37
seeds:38
seeds:39
seeds:40
seeds:41
seeds:42
seeds:43
seeds:44
seeds:45
seeds:46
seeds:47
seeds:48
seeds:49
seeds:50
seeds:51
seeds:52
seeds:53
seeds:54
seeds:55
seeds:56
seeds:57
seeds:58
seeds:59
seeds:60
seeds:61
seeds:62
seeds:63
seeds:64
seeds:65
seeds:66
seeds:67
seeds:68
seeds:69
seeds:70
seeds:71
seeds:72
seeds:73
seeds:74
seeds:75
seeds:76
seeds:77
seeds:78
seeds:79
seeds:80
seeds:81
seeds:82
seeds:83
seeds:84
seeds:85
seeds:86
seeds:87
seeds:88
seeds:89
seeds:90
seeds:91
seeds:92
seeds:93
seeds:94
seeds:95
seeds:96
seeds:97
seeds:98
seeds:99
clustering for time: 14.0
clustering for time: 16.0
clustering for time: 18.0
building graph...
time adjustment...
connecting nodes ....
/usr0/home/chiehl1/venv/local/lib/python2.7/site-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
G1
ID:  14.0_0
index:  0
P index:  None
ST:  0
T:  0
cell.ID:  E14_1_C44
cell.ID:  E14_1_C68
cell.ID:  E14_1_C84
cell.ID:  E14_1_C70
cell.ID:  E14_1_C90
cell.ID:  E14_1_C01
cell.ID:  E14_1_C25
cell.ID:  E14_1_C51
cell.ID:  E14_1_C10
cell.ID:  E14_1_C06
cell.ID:  E14_1_C69
cell.ID:  E14_1_C89
cell.ID:  E14_1_C75
cell.ID:  E14_1_C16
cell.ID:  E14_1_C31
cell.ID:  E14_1_C76
cell.ID:  E14_1_C78
cell.ID:  E14_1_C74
cell.ID:  E14_1_C14
cell.ID:  E14_1_C50
cell.ID:  E14_1_C83
cell.ID:  E14_1_C95
cell.ID:  E14_1_C65
cell.ID:  E14_1_C54
cell.ID:  E14_1_C39
cell.ID:  E14_1_C02
cell.ID:  E14_1_C85
cell.ID:  E14_1_C94
cell.ID:  E14_1_C09
cell.ID:  E14_1_C46
cell.ID:  E14_1_C60
cell.ID:  E14_1_C22
cell.ID:  E14_1_C05
cell.ID:  E14_1_C18
cell.ID:  E14_1_C73
cell.ID:  E14_1_C57
cell.ID:  E14_1_C77
cell.ID:  E14_1_C81
cell.ID:  E14_1_C87
cell.ID:  E14_1_C61
cell.ID:  E14_1_C79
cell.ID:  E14_1_C66
cell.ID:  E14_1_C96
cell.ID:  E14_1_C56
cell.ID:  E14_1_C80
ID:  16.0_0
index:  1
P index:  0
ST:  16.0
T:  1
cell.ID:  E16_1_C90
cell.ID:  E16_1_C04
cell.ID:  E16_1_C94
cell.ID:  E16_1_C68
cell.ID:  E16_1_C72
cell.ID:  E16_1_C01
cell.ID:  E16_1_C06
cell.ID:  E16_1_C14
cell.ID:  E16_1_C40
cell.ID:  E16_1_C65
cell.ID:  E16_1_C66
cell.ID:  E16_1_C54
cell.ID:  E16_1_C24
cell.ID:  E16_1_C84
cell.ID:  E16_1_C67
cell.ID:  E16_1_C95
cell.ID:  E16_1_C31
cell.ID:  E16_1_C88
cell.ID:  E16_1_C49
cell.ID:  E16_1_C92
ID:  18.0_4
index:  2
P index:  0
ST:  18.0
T:  1
cell.ID:  E18_3_C51
cell.ID:  E18_3_C75
cell.ID:  E18_3_C13
ID:  18.0_2
index:  3
P index:  0
ST:  18.0
T:  2
cell.ID:  E18_1_C18
cell.ID:  E18_1_C15
cell.ID:  E18_2_C77
cell.ID:  E18_3_C40
cell.ID:  E18_1_C47
cell.ID:  E18_3_C53
cell.ID:  E18_1_C53
cell.ID:  E18_3_C36
cell.ID:  E18_3_C15
cell.ID:  E18_1_C46
ID:  18.0_0
index:  4
P index:  1
ST:  18.0
T:  2
cell.ID:  E18_2_C73
cell.ID:  E18_2_C88
cell.ID:  E18_1_C84
cell.ID:  E18_3_C37
cell.ID:  E18_2_C43
cell.ID:  E18_2_C87
cell.ID:  E18_3_C61
cell.ID:  E18_2_C42
cell.ID:  E18_3_C09
cell.ID:  E18_1_C34
cell.ID:  E18_2_C22
cell.ID:  E18_3_C54
cell.ID:  E18_3_C10
cell.ID:  E18_2_C15
cell.ID:  E18_1_C37
cell.ID:  E18_3_C59
cell.ID:  E18_2_C44
cell.ID:  E18_2_C26
cell.ID:  E18_1_C33
cell.ID:  E18_3_C87
cell.ID:  E18_2_C12
cell.ID:  E18_2_C30
cell.ID:  E18_1_C09
cell.ID:  E18_3_C29
cell.ID:  E18_2_C36
cell.ID:  E18_2_C31
cell.ID:  E18_2_C50
cell.ID:  E18_3_C28
cell.ID:  E18_2_C13
cell.ID:  E18_3_C47
cell.ID:  E18_2_C74
cell.ID:  E18_2_C20
cell.ID:  E18_2_C90
cell.ID:  E18_2_C59
cell.ID:  E18_2_C32
cell.ID:  E18_2_C38
cell.ID:  E18_3_C55
cell.ID:  E18_2_C07
ID:  18.0_1
index:  5
P index:  4
ST:  18.0
T:  3
cell.ID:  E18_2_C72
cell.ID:  E18_3_C80
cell.ID:  E18_3_C82
cell.ID:  E18_1_C45
cell.ID:  E18_3_C23
cell.ID:  E18_3_C48
cell.ID:  E18_3_C41
cell.ID:  E18_2_C81
cell.ID:  E18_1_C08
cell.ID:  E18_1_C11
cell.ID:  E18_1_C13
cell.ID:  E18_2_C11
cell.ID:  E18_1_C44
cell.ID:  E18_3_C26
ID:  16.0_1
index:  6
P index:  1
ST:  16.0
T:  3
cell.ID:  E16_1_C15
cell.ID:  E16_1_C83
cell.ID:  E16_1_C56
cell.ID:  E16_1_C86
cell.ID:  E16_1_C57
cell.ID:  E16_1_C05
cell.ID:  E16_1_C74
ID:  18.0_3
index:  7
P index:  4
ST:  18.0
T:  3
cell.ID:  E18_2_C69
cell.ID:  E18_2_C55
cell.ID:  E18_1_C62
cell.ID:  E18_3_C90
cell.ID:  E18_2_C65
cell.ID:  E18_1_C38
cell.ID:  E18_1_C67
cell.ID:  E18_2_C49
cell.ID:  E18_1_C50
cell.ID:  E18_2_C63
cell.ID:  E18_3_C06
cell.ID:  E18_2_C06
cell.ID:  E18_2_C92
cell.ID:  E18_2_C61
cell.ID:  E18_1_C60
['0 1', '0 2', '0 3', '1 4', '4 5', '1 6', '4 7']
['E14_1_C44 0', 'E14_1_C68 0', 'E14_1_C84 0', 'E14_1_C70 0', 'E14_1_C90 0', 'E14_1_C01 0', 'E14_1_C25 0', 'E14_1_C51 0', 'E14_1_C10 0', 'E14_1_C06 0', 'E14_1_C69 0', 'E14_1_C89 0', 'E14_1_C75 0', 'E14_1_C16 0', 'E14_1_C31 0', 'E14_1_C76 0', 'E14_1_C78 0', 'E14_1_C74 0', 'E14_1_C14 0', 'E14_1_C50 0', 'E14_1_C83 0', 'E14_1_C95 0', 'E14_1_C65 0', 'E14_1_C54 0', 'E14_1_C39 0', 'E14_1_C02 0', 'E14_1_C85 0', 'E14_1_C94 0', 'E14_1_C09 0', 'E14_1_C46 0', 'E14_1_C60 0', 'E14_1_C22 0', 'E14_1_C05 0', 'E14_1_C18 0', 'E14_1_C73 0', 'E14_1_C57 0', 'E14_1_C77 0', 'E14_1_C81 0', 'E14_1_C87 0', 'E14_1_C61 0', 'E14_1_C79 0', 'E14_1_C66 0', 'E14_1_C96 0', 'E14_1_C56 0', 'E14_1_C80 0', 'E16_1_C90 1', 'E16_1_C04 1', 'E16_1_C94 1', 'E16_1_C68 1', 'E16_1_C72 1', 'E16_1_C01 1', 'E16_1_C06 1', 'E16_1_C14 1', 'E16_1_C40 1', 'E16_1_C65 1', 'E16_1_C66 1', 'E16_1_C54 1', 'E16_1_C24 1', 'E16_1_C84 1', 'E16_1_C67 1', 'E16_1_C95 1', 'E16_1_C31 1', 'E16_1_C88 1', 'E16_1_C49 1', 'E16_1_C92 1', 'E18_3_C51 2', 'E18_3_C75 2', 'E18_3_C13 2', 'E18_1_C18 3', 'E18_1_C15 3', 'E18_2_C77 3', 'E18_3_C40 3', 'E18_1_C47 3', 'E18_3_C53 3', 'E18_1_C53 3', 'E18_3_C36 3', 'E18_3_C15 3', 'E18_1_C46 3', 'E18_2_C73 4', 'E18_2_C88 4', 'E18_1_C84 4', 'E18_3_C37 4', 'E18_2_C43 4', 'E18_2_C87 4', 'E18_3_C61 4', 'E18_2_C42 4', 'E18_3_C09 4', 'E18_1_C34 4', 'E18_2_C22 4', 'E18_3_C54 4', 'E18_3_C10 4', 'E18_2_C15 4', 'E18_1_C37 4', 'E18_3_C59 4', 'E18_2_C44 4', 'E18_2_C26 4', 'E18_1_C33 4', 'E18_3_C87 4', 'E18_2_C12 4', 'E18_2_C30 4', 'E18_1_C09 4', 'E18_3_C29 4', 'E18_2_C36 4', 'E18_2_C31 4', 'E18_2_C50 4', 'E18_3_C28 4', 'E18_2_C13 4', 'E18_3_C47 4', 'E18_2_C74 4', 'E18_2_C20 4', 'E18_2_C90 4', 'E18_2_C59 4', 'E18_2_C32 4', 'E18_2_C38 4', 'E18_3_C55 4', 'E18_2_C07 4', 'E18_2_C72 5', 'E18_3_C80 5', 'E18_3_C82 5', 'E18_1_C45 5', 'E18_3_C23 5', 'E18_3_C48 5', 'E18_3_C41 5', 'E18_2_C81 5', 'E18_1_C08 5', 'E18_1_C11 5', 'E18_1_C13 5', 'E18_2_C11 5', 'E18_1_C44 5', 'E18_3_C26 5', 'E16_1_C15 6', 'E16_1_C83 6', 'E16_1_C56 6', 
'E16_1_C86 6', 'E16_1_C57 6', 'E16_1_C05 6', 'E16_1_C74 6', 'E18_2_C69 7', 'E18_2_C55 7', 'E18_1_C62 7', 'E18_3_C90 7', 'E18_2_C65 7', 'E18_1_C38 7', 'E18_1_C67 7', 'E18_2_C49 7', 'E18_1_C50 7', 'E18_2_C63 7', 'E18_3_C06 7', 'E18_2_C06 7', 'E18_2_C92 7', 'E18_2_C61 7', 'E18_1_C60 7']
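
The two lists printed at the end of the initialization appear to be the edges of the initial tree (each entry "parent child") and the cell assignments (each entry "cell_id node"). As a minimal sketch, assuming only that printed format (the layout of init_cluster_treutlein2014.txt itself is not shown here), the root-to-leaf paths can be recovered from the edge list. The helper names are hypothetical and are not part of scdiff_init.

```python
from __future__ import print_function
from collections import defaultdict

# Edge list copied from the output above; each entry is "parent child".
edges = ['0 1', '0 2', '0 3', '1 4', '4 5', '1 6', '4 7']

def build_children(edge_strings):
    """Map each parent node index to the sorted list of its children."""
    children = defaultdict(list)
    for entry in edge_strings:
        parent, child = (int(tok) for tok in entry.split())
        children[parent].append(child)
    return {p: sorted(c) for p, c in children.items()}

def root_to_leaf_paths(children, root=0):
    """Enumerate every path from the root to a leaf with an explicit stack."""
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            # No children: this node is a leaf, so the path is complete.
            paths.append(path)
        for kid in kids:
            stack.append(path + [kid])
    return sorted(paths)

children = build_children(edges)
paths = root_to_leaf_paths(children)
for path in paths:
    print("_".join(str(node) for node in path))
```

The same five paths (0_2, 0_3, 0_1_4_5, 0_1_6, 0_1_4_7) are listed by analyze_gene further down, which is a quick consistency check on the structure.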

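Train the CSHMM model and select the best iteration with cross validation; this will take some time. To save time, the number of sample points can be set to 10 (instead of the 100 used in the paper) and the number of genes to 3000. Note that the command below, as run here, passes --n_split 100 and -ng 16000, with cross validation turned off (--cross_validation 0) and two training iterations (--n_iteration 2). Cross validation can be turned on and the number of iterations can also be adjusted.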

In [4]:
cmd = "python CSHMM_train.py \
--data_file treutlein2014 \
--structure_file init_cluster_treutlein2014.txt \
--n_split 100 -ng 16000 \
--n_iteration 2 \
--cross_validation 0 \
--random_seed 0 \
--model_name lung_developmental"
In [5]:
result_stdout = subprocess.check_output(cmd.split(" "))
result_stdout
In [6]:
print result_stdout
Namespace(assign_by_prob_sampling=1, cluster_init=0, cross_validation=0, data_file='treutlein2014', data_file_testing=None, drop_out_param=0.1, k_param_range=5, lamb=1, lamb_data_mult='1', lamb_ratio_mult='1', model_name='lung_developmental', n_anchor=0, n_gene=16000, n_iteration=2, n_split=100, optimize_w=0, path_constraint=0, progress_bar=0, random_seed=0, structure_file='init_cluster_treutlein2014.txt')
loading data......
152  cell loaded with  15174  genes selected
initializing parameters and hidden variable with Juns model structure......
['E14_1_C44', 'E14_1_C68', 'E14_1_C84', 'E14_1_C70', 'E14_1_C90', 'E14_1_C01', 'E14_1_C25', 'E14_1_C51', 'E14_1_C10', 'E14_1_C06', 'E14_1_C69', 'E14_1_C89', 'E14_1_C75', 'E14_1_C16', 'E14_1_C31', 'E14_1_C76', 'E14_1_C78', 'E14_1_C74', 'E14_1_C14', 'E14_1_C50', 'E14_1_C83', 'E14_1_C95', 'E14_1_C65', 'E14_1_C54', 'E14_1_C39', 'E14_1_C02', 'E14_1_C85', 'E14_1_C94', 'E14_1_C09', 'E14_1_C46', 'E14_1_C60', 'E14_1_C22', 'E14_1_C05', 'E14_1_C18', 'E14_1_C73', 'E14_1_C57', 'E14_1_C77', 'E14_1_C81', 'E14_1_C87', 'E14_1_C61', 'E14_1_C79', 'E14_1_C66', 'E14_1_C96', 'E14_1_C56', 'E14_1_C80', 'E16_1_C90', 'E16_1_C15', 'E16_1_C04', 'E16_1_C83', 'E16_1_C94', 'E16_1_C68', 'E16_1_C56', 'E16_1_C72', 'E16_1_C86', 'E16_1_C01', 'E16_1_C57', 'E16_1_C06', 'E16_1_C14', 'E16_1_C40', 'E16_1_C65', 'E16_1_C66', 'E16_1_C54', 'E16_1_C24', 'E16_1_C84', 'E16_1_C67', 'E16_1_C95', 'E16_1_C31', 'E16_1_C05', 'E16_1_C88', 'E16_1_C49', 'E16_1_C74', 'E16_1_C92', 'E18_2_C73', 'E18_2_C69', 'E18_1_C18', 'E18_2_C55', 'E18_1_C62', 'E18_2_C88', 'E18_3_C90', 'E18_1_C15', 'E18_1_C84', 'E18_3_C37', 'E18_2_C77', 'E18_2_C43', 'E18_2_C87', 'E18_3_C61', 'E18_2_C42', 'E18_3_C09', 'E18_3_C40', 'E18_2_C72', 'E18_1_C34', 'E18_2_C65', 'E18_2_C22', 'E18_3_C54', 'E18_3_C80', 'E18_1_C47', 'E18_1_C38', 'E18_1_C67', 'E18_3_C10', 'E18_3_C82', 'E18_3_C51', 'E18_2_C15', 'E18_2_C49', 'E18_1_C50', 'E18_1_C37', 'E18_3_C53', 'E18_3_C59', 'E18_2_C44', 'E18_2_C63', 'E18_3_C75', 'E18_2_C26', 'E18_1_C33', 'E18_3_C87', 'E18_3_C06', 'E18_1_C45', 'E18_2_C12', 'E18_1_C53', 'E18_2_C30', 'E18_1_C09', 'E18_3_C29', 'E18_2_C36', 'E18_2_C31', 'E18_3_C36', 'E18_2_C06', 'E18_2_C50', 'E18_3_C15', 'E18_3_C23', 'E18_3_C48', 'E18_3_C41', 'E18_3_C28', 'E18_2_C81', 'E18_2_C92', 'E18_3_C13', 'E18_2_C13', 'E18_3_C47', 'E18_1_C08', 'E18_1_C11', 'E18_2_C74', 'E18_2_C61', 'E18_2_C20', 'E18_1_C13', 'E18_1_C60', 'E18_2_C11', 'E18_2_C90', 'E18_2_C59', 'E18_1_C44', 'E18_2_C32', 'E18_3_C26', 'E18_2_C38', 'E18_1_C46', 'E18_3_C55', 'E18_2_C07']
M-step: optimizing w param......
confussion matrix:
['NA', 'AT1', 'Clara', 'BP', 'AT2', 'ciliated']
[[ 45.   0.   0.   0.   0.   0.]
 [ 20.   0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.   3.]
 [  0.   0.  10.   0.   0.   0.]
 [  0.  28.   0.  10.   0.   0.]
 [  0.   0.   0.   2.  12.   0.]
 [  7.   0.   0.   0.   0.   0.]
 [  0.  13.   1.   1.   0.   0.]]
ARI:  0.456668716574
training iteration:  1
cell paths:  (array([0, 1, 2, 3, 4, 5, 6, 7]), array([45, 20,  3, 10, 38, 14,  7, 15]))
calculating  ALL  score......
model score:  ('(AIC,BIC,G2,G3,G4,G5,G6)=', (-16184894.146563159, -17010812.680982944, -33359979.19642818, -22476604.740819864, -23857092.792738877, -21157568.492346037, -32815213.829627156))
M-step: optimizing g param with CVX......
after M-step g_param full log-likelihood:  -4581344.01478
M-step: optimizing sigma param......
after M-step sigma_param full log-likelihood:  -3682354.6769
M-step: optimizing K param......
after M-step K_param full log-likelihood:  -3673142.7071
E-step: assigning new path and time for cell......
after E-step full log-likelihood:  -3667538.77369
adjusting model structure 
0 -> 1
0 -> 2
0 -> 3
1 -> 4
1 -> 6
4 -> 5
4 -> 7
M-step: optimizing w param......
after setting w_nz full log-likelihood:  -3667538.77369
after setting trans_prob full log-likelihood:  -3667463.79814
confussion matrix:
['NA', 'AT1', 'Clara', 'BP', 'AT2', 'ciliated']
[[ 37.   0.   0.   0.   0.   0.]
 [ 22.   0.   0.   0.   0.   0.]
 [  1.   0.   0.   0.   0.   3.]
 [  3.   0.  10.   0.   0.   0.]
 [  2.  26.   0.  10.   0.   0.]
 [  0.   0.   0.   2.  12.   0.]
 [  7.   0.   0.   0.   0.   0.]
 [  0.  15.   1.   1.   0.   0.]]
ARI:  0.443207925463
saving model to file:  lung_developmental_it1.pickle
training iteration:  2
cell paths:  (array([0, 1, 2, 3, 4, 5, 6, 7]), array([37, 22,  4, 13, 38, 14,  7, 17]))
calculating  ALL  score......
model score:  ('(AIC,BIC,G2,G3,G4,G5,G6)=', (-7881191.5962896235, -8707110.1307094097, -25056276.646154646, -14172902.19054633, -15553390.242465343, -12853865.942072503, -24511511.279353619))
M-step: optimizing g param with CVX......
after M-step g_param full log-likelihood:  -3658954.71936
M-step: optimizing sigma param......
after M-step sigma_param full log-likelihood:  -3658237.39742
M-step: optimizing K param......
after M-step K_param full log-likelihood:  -3656224.85601
E-step: assigning new path and time for cell......
after E-step full log-likelihood:  -3654789.73329
adjusting model structure 
0 -> 1
0 -> 2
0 -> 3
1 -> 4
1 -> 6
4 -> 5
4 -> 7
M-step: optimizing w param......
after setting w_nz full log-likelihood:  -3654789.73329
after setting trans_prob full log-likelihood:  -3654789.73329
confussion matrix:
['NA', 'AT1', 'Clara', 'BP', 'AT2', 'ciliated']
[[ 37.   0.   0.   0.   0.   0.]
 [ 22.   0.   0.   0.   0.   0.]
 [  1.   0.   0.   0.   0.   3.]
 [  3.   0.  10.   0.   0.   0.]
 [  2.  26.   0.  10.   0.   0.]
 [  0.   0.   0.   2.  12.   0.]
 [  7.   0.   0.   0.   0.   0.]
 [  0.  15.   1.   1.   0.   0.]]
ARI:  0.443207925463
path assignment the same as previous iteration, stop training.
saving model to file:  lung_developmental_it2.pickle
maximum training iteration reached.
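
The training log reports the full log-likelihood after each M-step and E-step. One way to confirm that it does not decrease within an iteration is to pull those values out of the captured output with a regular expression. The sketch below assumes only the line format visible in the output above; sample_log is an excerpt copied from iteration 1, and the function name is hypothetical.

```python
from __future__ import print_function
import re

# Excerpt copied from the training output above (iteration 1).
sample_log = """\
after M-step g_param full log-likelihood:  -4581344.01478
after M-step sigma_param full log-likelihood:  -3682354.6769
after M-step K_param full log-likelihood:  -3673142.7071
after E-step full log-likelihood:  -3667538.77369
ARI:  0.443207925463
"""

# Captures the step description and the numeric value on each matching line.
LL_PATTERN = re.compile(
    r"^after (.+?) full log-likelihood:\s+(-?\d+(?:\.\d+)?)[ \t]*$",
    re.MULTILINE,
)

def extract_log_likelihoods(log_text):
    """Return (step, log_likelihood) pairs in the order they were logged."""
    return [(step, float(value)) for step, value in LL_PATTERN.findall(log_text)]

steps = extract_log_likelihoods(sample_log)
values = [value for _, value in steps]
is_non_decreasing = all(a <= b for a, b in zip(values, values[1:]))

for step, value in steps:
    print("%-20s %.2f" % (step, value))
print("non-decreasing:", is_non_decreasing)
```

In the notebook, the same function could be applied to result_stdout directly instead of the excerpt.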

Print the model structure and cell assignments

In [7]:
model_file = "lung_developmental_it2.pickle"
In [8]:
#import CSHMM_train as ML
CSHMM_analysis.plot_path_fig(model_file,data_file,circle_size=20)
plotting path figure for model file:  lung_developmental_it2.pickle
loading model from file:  lung_developmental_it2.pickle
loading data......
152  cell loaded with  15174  genes selected
structure.png
<matplotlib.figure.Figure at 0x7f61b2b432d0>
/usr0/home/chiehl1/venv/local/lib/python2.7/site-packages/matplotlib/figure.py:1743: UserWarning: This figure includes Axes that are not compatible with tight_layout, so its results might be incorrect.
  warnings.warn("This figure includes Axes that are not "

Load marker genes for the lung and neuron datasets. Note that you can change the init_marker_gene_dict function in CSHMM_analysis to use your own set of markers.

In [9]:
(CSHMM_analysis.marker_gene_list,
 CSHMM_analysis.marker_genes_dict,
 CSHMM_analysis.treutlein2014_mkgene,
 CSHMM_analysis.treutlein2014_mkgene_dict,
 CSHMM_analysis.treutlein2016_mkgene,
 CSHMM_analysis.treutlein2016_mkgene_dict) = CSHMM_analysis.init_marker_gene_dict()
['Top2a', 'Itgb4', 'Begfa', 'Cebpa', 'Sftpd', 'Sftpb', 'Sftpc', 'Muc1', 'Abca3', 'Foxj1', 'Ager', 'Pdpn', 'Id2', 'Scgb1a1', 'Aqp5', 'Cftr', 'Kyz2']
['Bex1', 'Snca', 'Mtap1a', 'Tnnc2', 'Sept4', 'Homer2', 'Sept3', 'Sox9', 'Ecm1', 'Sox2', 'Coro2b', 'Ascl1', 'Tubb3', 'Nestin', 'Atoh8', 'Dcn', 'Stmn3', 'Tcf12', 'Stxbp1', 'Dmpk', 'Sox11', 'Sv2a', 'Ppp3ca', 'St18', 'Snap25', 'Eln', 'Syt4', 'Scd1', 'Mapt', 'Vamp2', 'Nrxn3', 'Map2', 'Pax6', 'Cox8b', 'Hes1', 'Klhl24', 'Gria2', 'Fabp7', 'Ank2', 'Hes6', 'Camta1', 'Tcf4', 'Ube2c', 'Id3', 'Insm1', 'Cadm1', 'Acta1', 'Hmga2', 'Syp', 'Rab3c', 'Dner', 'Akap9', 'Myt1l', 'Zfp238', 'Myh3', 'Birc5', 'Gli3', 'Myo18b', 'Col1a2', 'Dlx3', 'Peg3']
61
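
When supplying your own markers, it can help to check first which names actually occur in the expression data, since markers that are absent have no expression values to plot. The sketch below is not part of CSHMM_analysis. It assumes a dictionary that maps each gene to a (label, source) pair, which is the shape suggested by the annotations printed later in this notebook; dataset_genes is a hypothetical stand-in for the gene names loaded from the data file.

```python
from __future__ import print_function

# Hypothetical custom marker dictionary: gene -> (label, source).
# The four entries reuse annotations that appear in this notebook's output.
custom_markers = {
    'Sftpc': ('AT2', 'treutlein2014: alveolar/bronchiolar lineages'),
    'Sftpb': ('AT2', 'treutlein2014: alveolar/bronchiolar lineages'),
    'Top2a': ('ciliated', 'treutlein2014: novel'),
    'Itgb4': ('ciliated', 'treutlein2014: novel'),
}

# Hypothetical stand-in for the gene names present in the expression data.
dataset_genes = ['Ager', 'Sftpb', 'Sftpc', 'Top2a']

def split_markers(marker_dict, gene_names):
    """Split marker names into those present in and absent from the data."""
    available = set(gene_names)
    found = sorted(g for g in marker_dict if g in available)
    missing = sorted(g for g in marker_dict if g not in available)
    return found, missing

found, missing = split_markers(custom_markers, dataset_genes)
print("found:  ", found)
print("missing:", missing)
```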

Generate GO analysis files for each path

In [10]:
model_ana_temp=CSHMM_analysis.analyze_gene(model_file,data_file=data_file,out_folder = "lung_results")
analyzing for model file:  lung_developmental_it2.pickle
loading model from file:  lung_developmental_it2.pickle
loading data......
152  cell loaded with  15174  genes selected
rm -rf lung_results
mkdir lung_results
path:  0_2
path:  0_3
path:  0_1_4_5
path:  0_1_6
path:  0_1_4_7

Analyze a pair of paths to show differentially expressed genes and run GO analysis

In [11]:
model_ana_temp = CSHMM_analysis.analyze_path_difference(model_ana_temp,('AT1',(7,4)),('AT2',(5,4)),"")
        
('AT1', (7, 4))
('AT2', (5, 4))
0.927059324576 Sftpc ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
AT2|AT1_AT2__top0_scor_diff
0.862082678229 Fabp5 ('', '')
AT1_AT2__top1_scor_diff
0.836894024651 Slc34a2 ('', '')
AT1_AT2__top2_scor_diff
0.830315508775 Egfl6 ('', '')
AT1_AT2__top3_scor_diff
0.827688876267 Bex2 ('', '')
AT1_AT2__top4_scor_diff
0.819242209968 Lamp3 ('', '')
AT1_AT2__top5_scor_diff
0.805247441352 Dlk1 ('', '')
DEgenes|AT1_AT2__top6_scor_diff
0.804109922748 Cxcl15 ('', '')
AT1_AT2__top7_scor_diff
0.795489729951 Sftpa1 ('', '')
AT1_AT2__top8_scor_diff
0.794895639335 Hc ('', '')
AT1_AT2__top9_scor_diff
0.781678452761 Cd36 ('', '')
AT1_AT2__top10_scor_diff
0.778890633117 Soat1 ('', '')
AT1_AT2__top11_scor_diff
0.769967390012 Etv5 ('', '')
AT1_AT2__top12_scor_diff
0.761038236222 Chi3l1 ('', '')
AT1_AT2__top13_scor_diff
0.758304829942 Sftpb ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
AT2|AT1_AT2__top14_scor_diff
0.747573702527 S100g ('', '')
AT1_AT2__top15_scor_diff
0.746784562885 Glrx ('', '')
AT1_AT2__top16_scor_diff
0.743595397289 Ctsc ('', '')
AT1_AT2__top17_scor_diff
0.738200814494 Lcn2 ('', '')
AT1_AT2__top18_scor_diff
0.735863501062 Napsa ('', '')
AT1_AT2__top19_scor_diff
In [12]:
model_ana_temp = CSHMM_analysis.analyze_path_difference(model_ana_temp,('ciliated',tuple([2,0])),('Clara',tuple([3,0])),"")
('ciliated', (2, 0))
('Clara', (3, 0))
0.720554839912 Ptges3 ('', '')
ciliated_Clara__top0_scor_diff
0.685225234456 Upf3b ('', '')
ciliated_Clara__top1_scor_diff
0.608120073444 Mycbp ('', '')
ciliated_Clara__top2_scor_diff
0.599145104383 1110017D15Rik ('', '')
ciliated_Clara__top3_scor_diff
0.594972672984 Txndc12 ('', '')
ciliated_Clara__top4_scor_diff
0.592056501868 Dnajb6 ('', '')
ciliated_Clara__top5_scor_diff
0.58576447329 Tuba1b ('', '')
ciliated_Clara__top6_scor_diff
0.585742741719 Kndc1 ('', '')
ciliated_Clara__top7_scor_diff
0.5851712955 Psmg2 ('', '')
ciliated_Clara__top8_scor_diff
0.579617429488 Cspp1 ('', '')
ciliated_Clara__top9_scor_diff
0.577704946879 Oxa1l ('', '')
ciliated_Clara__top10_scor_diff
0.577498969979 Nedd8 ('', '')
ciliated_Clara__top11_scor_diff
0.57134881425 Ublcp1 ('', '')
ciliated_Clara__top12_scor_diff
0.570525674117 Zfp330 ('', '')
ciliated_Clara__top13_scor_diff
0.570525178154 Slc35b1 ('', '')
ciliated_Clara__top14_scor_diff
0.568352922551 Naa35 ('', '')
ciliated_Clara__top15_scor_diff
0.566970355828 Ttc18 ('', '')
ciliated_Clara__top16_scor_diff
0.566571836516 Psip1 ('', '')
ciliated_Clara__top17_scor_diff
0.560548717295 Gtl3 ('', '')
ciliated_Clara__top18_scor_diff
0.555839328627 Myh10 ('', '')
ciliated_Clara__top19_scor_diff
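
analyze_path_difference prints two lines per gene: first the score, the gene name and a marker annotation tuple, then a label ending in __top<rank>_scor_diff. A minimal sketch for turning that printed text into a ranked table follows, assuming only the format shown above. The function name is hypothetical and the excerpt is copied from the AT1/AT2 comparison.

```python
from __future__ import print_function
import ast
import re

# Excerpt copied from the AT1 vs AT2 output above.
sample_output = """\
0.927059324576 Sftpc ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
AT2|AT1_AT2__top0_scor_diff
0.862082678229 Fabp5 ('', '')
AT1_AT2__top1_scor_diff
0.758304829942 Sftpb ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
AT2|AT1_AT2__top14_scor_diff
"""

# A score line is "<float> <gene> (<label>, <source>)"; label lines do not match.
SCORE_LINE = re.compile(r"^(\d+\.\d+) (\S+) (\(.*\))$")

def parse_scores(text):
    """Return one dict per score line, in the order printed."""
    rows = []
    for line in text.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match:
            score, gene, annotation = match.groups()
            # The annotation is a printed Python tuple of two strings.
            label, source = ast.literal_eval(annotation)
            rows.append({"gene": gene, "score": float(score),
                         "label": label, "source": source})
    return rows

rows = parse_scores(sample_output)
known = [row["gene"] for row in rows if row["label"]]
for row in rows:
    print("%-8s %.3f %s" % (row["gene"], row["score"], row["label"] or "-"))
print("previously annotated markers:", known)
```

Genes with an empty annotation, such as Fabp5 here, are the ones not in the loaded marker lists.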

Plot continuous gene expression

In [13]:
CSHMM_analysis.plot_cont_marker_gexp(model_ana_temp,remove_path=tuple([6,1,0]))
['Psip1', 'Bex2', 'Bex1', 'Foxa2', 'Snca', 'Top2a', 'Itgb4', 'Ctsc', 'Mtap1a', 'Cd36', 'Tnnc2', 'Hes1', 'Homer2', 'Sept3', 'Begfa', 'Sox9', 'Dlk1', 'Ecm1', 'Sox2', 'Coro2b', 'Cebpa', 'Ptges3', 'Ascl1', 'Tubb3', 'Chi3l1', 'Sftpa1', 'Naa35', 'Cxcl15', 'Nestin', 'Gabpb1', 'Slc35b1', 'Cdc6', 'Atoh8', 'Dcn', 'Stmn3', 'Oxa1l', 'Dnajb6', 'Tcf12', 'Hc', 'Ttc18', 'Sftpd', 'Sftpb', 'Sftpc', 'Muc1', 'Stxbp1', 'Egfl6', 'Abca3', 'Dmpk', 'Gtl3', 'Foxj1', 'Sox11', 'Nedd8', 'Ager', 'Slc34a2', 'Cdk4', 'Sv2a', 'Ppp3ca', 'St18', 'Snap25', 'Eln', 'Syt4', 'Scd1', 'Pdpn', 'Mapt', 'Zfp330', 'Etv5', 'Ppp3r1', 'Jund', 'Myh10', 'Nasp', 'Vamp2', 'Lcn2', 'Nrxn3', 'Map2', 'Upf3b', 'Pax6', 'Cox8b', 'Sept4', 'Klhl24', 'Ublcp1', 'Gria2', 'Naa50', 'Txndc12', 'Fabp7', 'Ank2', 'Fabp5', 'Hes6', 'Camta1', 'Tcf4', 'Adamfs10', 'Ube2c', 'Id2', 'Id3', 'Insm1', 'Scgb1a1', 'Cadm1', 'Soat1', 'Tuba1b', 'Hmga2', 'Syp', 'Rab3c', 'Rars2', 'Glrx', 'Lamp3', 'S100g', 'Dner', '1110017D15Rik', 'Aqp5', 'Cspp1', 'Akap9', 'Acta1', 'Psmg2', 'Myt1l', 'Zfp238', 'Kndc1', 'Cftr', 'Kyz2', 'Myh3', 'Napsa', 'Hmgb2', 'Birc5', 'Myo1b', 'Mycbp', 'Gli3', 'Myo18b', 'Col1a2', 'Dlx3', 'Peg3']
0 ['NA']
1 ['NA' 'NA_16']
2 ['NA' 'ciliated']
3 ['Clara' 'NA']
4 ['AT1' 'BP' 'NA']
5 ['AT2' 'BP']
6 ['NA_16']
7 ['AT1' 'BP' 'Clara']
mk_gene_list: ['Psip1', 'Bex2', 'Bex1', 'Foxa2', 'Snca', 'Top2a', 'Itgb4', 'Ctsc', 'Mtap1a', 'Cd36', 'Tnnc2', 'Hes1', 'Homer2', 'Sept3', 'Begfa', 'Sox9', 'Dlk1', 'Ecm1', 'Sox2', 'Coro2b', 'Cebpa', 'Ptges3', 'Ascl1', 'Tubb3', 'Chi3l1', 'Sftpa1', 'Naa35', 'Cxcl15', 'Nestin', 'Gabpb1', 'Slc35b1', 'Cdc6', 'Atoh8', 'Dcn', 'Stmn3', 'Oxa1l', 'Dnajb6', 'Tcf12', 'Hc', 'Ttc18', 'Sftpd', 'Sftpb', 'Sftpc', 'Muc1', 'Stxbp1', 'Egfl6', 'Abca3', 'Dmpk', 'Gtl3', 'Foxj1', 'Sox11', 'Nedd8', 'Ager', 'Slc34a2', 'Cdk4', 'Sv2a', 'Ppp3ca', 'St18', 'Snap25', 'Eln', 'Syt4', 'Scd1', 'Pdpn', 'Mapt', 'Zfp330', 'Etv5', 'Ppp3r1', 'Jund', 'Myh10', 'Nasp', 'Vamp2', 'Lcn2', 'Nrxn3', 'Map2', 'Upf3b', 'Pax6', 'Cox8b', 'Sept4', 'Klhl24', 'Ublcp1', 'Gria2', 'Naa50', 'Txndc12', 'Fabp7', 'Ank2', 'Fabp5', 'Hes6', 'Camta1', 'Tcf4', 'Adamfs10', 'Ube2c', 'Id2', 'Id3', 'Insm1', 'Scgb1a1', 'Cadm1', 'Soat1', 'Tuba1b', 'Hmga2', 'Syp', 'Rab3c', 'Rars2', 'Glrx', 'Lamp3', 'S100g', 'Dner', '1110017D15Rik', 'Aqp5', 'Cspp1', 'Akap9', 'Acta1', 'Psmg2', 'Myt1l', 'Zfp238', 'Kndc1', 'Cftr', 'Kyz2', 'Myh3', 'Napsa', 'Hmgb2', 'Birc5', 'Myo1b', 'Mycbp', 'Gli3', 'Myo18b', 'Col1a2', 'Dlx3', 'Peg3']
Psip1 ('ciliated_Clara__top17_scor_diff', 'found by our model')
/usr0/home/chiehl1/venv/local/lib/python2.7/site-packages/matplotlib/pyplot.py:504: UserWarning: close('all') closes all existing figures
  warnings.warn("close('all') closes all existing figures")
Bex2 ('AT1_AT2__top4_scor_diff', 'found by our model')
Bex1 ('Ascl1_target', 'treutlein2016: Extend Data Figure 3b')
Foxa2 ('DEgenes', 'sabrina TASIC: t-test/Scell')
Top2a ('ciliated', 'treutlein2014: novel')
Itgb4 ('ciliated', 'treutlein2014: novel')
Ctsc ('AT1_AT2__top17_scor_diff', 'found by our model')
Mtap1a ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Cd36 ('AT1_AT2__top10_scor_diff', 'found by our model')
Hes1 ('NPC', 'treutlein2016: main text')
Homer2 ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Sept3 ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Sox9 ('initial_factor|NPC', 'treutlein2016: Extend Data Figure 8c,main text')
Dlk1 ('DEgenes|AT1_AT2__top6_scor_diff', 'sabrina TASIC: Scell, found by our model')
Ecm1 ('MEF', 'treutlein2016: Extend Data Figure 3b')
Sox2 ('canonical_NPC', 'treutlein2016: main text')
Coro2b ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Cebpa ('AT2', 'treutlein2014: novel')
Ptges3 ('ciliated_Clara__top0_scor_diff', 'found by our model')
Ascl1 ('Ascl1_target|initial_factor', 'treutlein2016: Extend Data Figure 3b, 8c')
Chi3l1 ('AT1_AT2__top13_scor_diff', 'found by our model')
Sftpa1 ('AT1_AT2__top8_scor_diff', 'found by our model')
Naa35 ('ciliated_Clara__top15_scor_diff', 'found by our model')
Cxcl15 ('AT1_AT2__top7_scor_diff', 'found by our model')
Gabpb1 ('DEgenes', 'sabrina TASIC: t-test/Scell')
Slc35b1 ('ciliated_Clara__top14_scor_diff', 'found by our model')
Cdc6 ('DEgenes', 'sabrina TASIC: Scell')
Atoh8 ('initial_factor', 'treutlein2016: Extend Data Figure 8c')
Dcn ('MEF|Fibroblast', 'treutlein2016: Extend Data Figure 3b,6i')
Stmn3 ('Neuron|synaptic', 'treutlein2016: Extend Data Figure 6h, main text')
Oxa1l ('ciliated_Clara__top10_scor_diff', 'found by our model')
Dnajb6 ('ciliated_Clara__top5_scor_diff', 'found by our model')
Tcf12 ('initial_factor', 'treutlein2016: Extend Data Figure 8c')
Hc ('AT1_AT2__top9_scor_diff', 'found by our model')
Ttc18 ('ciliated_Clara__top16_scor_diff', 'found by our model')
Sftpd ('AT2', 'treutlein2014: novel')
Sftpb ('AT2|AT1_AT2__top14_scor_diff', 'treutlein2014: alveolar/bronchiolar lineages, found by our model')
Sftpc ('AT2|AT1_AT2__top0_scor_diff', 'treutlein2014: alveolar/bronchiolar lineages, found by our model')
Muc1 ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
Stxbp1 ('synaptic_transmission', 'treutlein2016: , main text')
Egfl6 ('AT1_AT2__top3_scor_diff', 'found by our model')
Abca3 ('AT2', 'treutlein2014: alveolar/bronchiolar lineages')
Dmpk ('synaptic_transmission', 'treutlein2016: , main text')
Gtl3 ('ciliated_Clara__top18_scor_diff', 'found by our model')
Foxj1 ('ciliated', 'treutlein2014: alveolar/bronchiolar lineages')
Sox11 ('initial_factor', 'treutlein2016: Extend Data Figure 8c')
Nedd8 ('ciliated_Clara__top11_scor_diff', 'found by our model')
Ager ('AT1', 'treutlein2014: alveolar/bronchiolar lineages')
Slc34a2 ('AT1_AT2__top2_scor_diff', 'found by our model')
Cdk4 ('DEgenes', 'sabrina TASIC: t-test/Scell')
Sv2a ('synaptic_maturation', 'treutlein2016: main text')
Ppp3ca ('synaptic_transmission', 'treutlein2016: , main text')
Snap25 ('Neuron|synaptic|synaptic_maturation', 'treutlein2016: Extend Data Figure 6h, main text')
Eln ('Fibroblast', 'treutlein2016: main text')
Scd1 ('MEF', 'treutlein2016: Extend Data Figure 3b')
Pdpn ('AT1', 'treutlein2014: alveolar/bronchiolar lineages')
Mapt ('neural_projections', 'treutlein2016: , main text')
Zfp330 ('ciliated_Clara__top13_scor_diff', 'found by our model')
Etv5 ('AT1_AT2__top12_scor_diff', 'found by our model')
Ppp3r1 ('DEgenes', 'sabrina TASIC: t-test')
Jund ('DEgenes', 'sabrina TASIC: t-test/Scell')
Myh10 ('ciliated_Clara__top19_scor_diff', 'found by our model')
Nasp ('DEgenes', 'sabrina TASIC: t-test/')
Vamp2 ('synaptic_transmission', 'treutlein2016: , main text')
Lcn2 ('AT1_AT2__top18_scor_diff', 'found by our model')
Nrxn3 ('Neuron|synaptic|synaptic_maturation', 'treutlein2016: Extend Data Figure 6h, main text')
Upf3b ('ciliated_Clara__top1_scor_diff', 'found by our model')
Pax6 ('canonical_NPC', 'treutlein2016: main text')
Sept4 ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Klhl24 ('neural_projections', 'treutlein2016: , main text')
Ublcp1 ('ciliated_Clara__top12_scor_diff', 'found by our model')
Gria2 ('Neuron|synaptic|synaptic_maturation', 'treutlein2016: Extend Data Figure 6h, main text')
Naa50 ('DEgenes', 'sabrina TASIC: t-test/Scell')
Txndc12 ('ciliated_Clara__top4_scor_diff', 'found by our model')
Fabp7 ('NPC', 'treutlein2016: main text')
Ank2 ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Fabp5 ('AT1_AT2__top1_scor_diff', 'found by our model')
Hes6 ('Ascl1_target|initial_factor', 'treutlein2016: Extend Data Figure 3b, 8c')
Camta1 ('Neuron', 'treutlein2016: Extend Data Figure 8d')
Tcf4 ('initial_factor', 'treutlein2016: Extend Data Figure 8c')
Ube2c ('mitosis', 'treutlein2016: , main text')
Id2 ('AT2', 'treutlein2014: novel')
Id3 ('MEF_factors', 'treutlein2016: Extend Data Figure 8e')
Insm1 ('Neuron', 'treutlein2016: Extend Data Figure 8d')
Scgb1a1 ('Clara', 'treutlein2014: alveolar/bronchiolar lineages')
Cadm1 ('neural_projections', 'treutlein2016: , main text')
Soat1 ('AT1_AT2__top11_scor_diff', 'found by our model')
Tuba1b ('ciliated_Clara__top6_scor_diff', 'found by our model')
Hmga2 ('MEF_factors|mitosis', 'treutlein2016: Extend Data Figure 8e, main text')
Syp ('Neuron|synaptic_maturation', 'treutlein2016: main text')
Rab3c ('synaptic_maturation', 'treutlein2016: main text')
Rars2 ('DEgenes', 'sabrina TASIC: t-test/Scell')
Glrx ('AT1_AT2__top16_scor_diff', 'found by our model')
Lamp3 ('AT1_AT2__top5_scor_diff', 'found by our model')
S100g ('AT1_AT2__top15_scor_diff', 'found by our model')
Dner ('Ascl1_target|neural_projections', 'treutlein2016: Extend Data Figure 3b, main text')
1110017D15Rik ('ciliated_Clara__top3_scor_diff', 'found by our model')
Aqp5 ('AT1', 'treutlein2014: alveolar/bronchiolar lineages')
Cspp1 ('ciliated_Clara__top9_scor_diff', 'found by our model')
Akap9 ('cytoskeletal_reorganization', 'treutlein2016: , main text')
Acta1 ('Myocyte', 'treutlein2016: Extend Data Figure 6g')
Psmg2 ('ciliated_Clara__top8_scor_diff', 'found by our model')
Zfp238 ('Ascl1_target|initial_factor', 'treutlein2016: Extend Data Figure 3b, 8c')
Kndc1 ('ciliated_Clara__top7_scor_diff', 'found by our model')
Cftr ('AT2', 'treutlein2014: novel')
Napsa ('AT1_AT2__top19_scor_diff', 'found by our model')
Hmgb2 ('DEgenes', 'sabrina TASIC: Scell')
Birc5 ('mitosis', 'treutlein2016: , main text')
Myo1b ('DEgenes', 'sabrina TASIC: t-test/Scell')
Mycbp ('ciliated_Clara__top2_scor_diff', 'found by our model')
Gli3 ('NPC', 'treutlein2016: main text')
Col1a2 ('Fibroblast', 'treutlein2016: Extend Data Figure 6i')
Dlx3 ('initial_factor', 'treutlein2016: Extend Data Figure 8c')
Peg3 ('Neuron', 'treutlein2016: Extend Data Figure 8d')
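
Each annotation line printed above has the shape `Gene ('category1|category2', 'source')`, with `|` separating multiple categories. If the printed output needs to be regrouped by category, a small parser along the following lines may help. This is only a sketch: `parse_annotation` and `genes_by_category` are hypothetical names introduced here, not part of `CSHMM_analysis`, and the three sample lines are copied from the output above.

```python
import ast
from collections import defaultdict


def parse_annotation(line):
    """Split "Gene ('cat1|cat2', 'source')" into (gene, [categories], source).

    Hypothetical helper: assumes the gene name contains no spaces and that
    the remainder of the line is a Python tuple literal of two strings.
    """
    gene, _, rest = line.partition(" ")
    categories, source = ast.literal_eval(rest)
    return gene, categories.split("|"), source


# Sample lines copied verbatim from the cell output.
lines = [
    "Sftpc ('AT2|AT1_AT2__top0_scor_diff', 'treutlein2014: alveolar/bronchiolar lineages, found by our model')",
    "Ager ('AT1', 'treutlein2014: alveolar/bronchiolar lineages')",
    "Fabp5 ('AT1_AT2__top1_scor_diff', 'found by our model')",
]

# Group gene names under every category they are annotated with.
genes_by_category = defaultdict(list)
for line in lines:
    gene, categories, source = parse_annotation(line)
    for category in categories:
        genes_by_category[category].append(gene)
```

A gene with several categories, such as Sftpc, appears once under each of them, so `genes_by_category` can be used to list, for example, only the genes "found by our model" for a given split by filtering on keys that end in `_scor_diff`.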